-
Notifications
You must be signed in to change notification settings - Fork 916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace direct cudaMemcpyAsync
calls with utility functions (within /include
)
#17557
Replace direct cudaMemcpyAsync
calls with utility functions (within /include
)
#17557
Conversation
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
…remove-memcpy-include
…into remove-memcpy-include
cudaMemcpyAsync
calls with utility functions (within /include
)
Co-authored-by: David Wendt <[email protected]>
return cudf::detail::make_host_vector_sync( | ||
device_span<T const>{col_view.data<T>() + element_index, 1}, stream).front(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quibble: In some alternate universe where pinned memory isn't being used and we drop back to the pageable resource, this is probably somewhat worse than what's there now. I imagine that's not a case we really care about though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. FWIW, we did a full benchmark run with a change like this in the device_scalar
, and there was no negative impact from involving the heap unnecessarily.
/merge |
Description
Replaced the calls to
cudaMemcpyAsync
with the newcuda_memcpy
/cuda_memcpy_async
utility, which optionally avoids using the copy engine.Also took the opportunity to use
cudf::detail::host_vector
and its factories to enable wider pinned memory use.Checklist